Efficient data unit reuse method and system

ABSTRACT

The present disclosure relates to a data unit reuse method, where data is stored in a data unit in the form of a data block and the data block has a block ID. The method includes: successively reading each data block in a current data unit to search for a first specific data block whose block ID does not conform to a predetermined order; determining whether at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit; when it exists, determining that the current data unit has been damaged, and when it does not exist, determining that a data block immediately previous to the specific data block is a data end.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is based upon and claims priority to Chinese Patent Application No. 201811468470.5, filed on Dec. 3, 2018, the entire content of which is incorporated herein by reference.

TECHNICAL FIELD

The present application relates to computer technologies, and in particular, to a method and system for reuse of a data unit.

BACKGROUND

During data reading/writing (for example, file reading/writing), especially during reading/writing of a log, a database, or a data file, it is usually necessary to solve the problem of how to reuse an old recyclable data unit in writing. In a file reading-writing case, for example, a file reading-writing solution in the prior art may be not to reuse an old file but to create a new file each time as required. In this solution, in order to guarantee that metadata of a file system is successfully written into a disk in an append writing operation, synchronization (SYNC), such as mutex locks, reading-writing locks, semaphores, etc., usually needs to be performed in a writing operation. Moreover, an old file needs to be deleted if the disk space is insufficient, which may result in poor performance.

In another file reading-writing solution, an EOD (end of data) identifier is appended to the end of the data in each writing operation, to implement file reuse. In the next writing operation, the file is searched for the EOD identifier, the found EOD identifier is used as an initial writing position, and the length of the EOD identifier is used as an offset. Then, a writing operation starts at a position offset forward from the initial writing position by the length of the EOD identifier, to overwrite the EOD identifier written last time. During file reading, if the EOD identifier is encountered, it indicates that the end of the file has been reached.

However, using an EOD identifier to identify the file end requires that data of a certain size be additionally written in each writing operation. For example, according to different alignment requirements, data of 512 bytes or 4096 bytes needs to be additionally written, which increases disk bandwidth consumption. Furthermore, overwriting the EOD identifier by means of offsetting is not beneficial to, or even not supported by, a mechanical disk or a network file system only supporting the append writing operation.

In still another file reading-writing solution, the file content is first completely overwritten with invalid data before reuse of the old file. After completion of the overwriting, new content is written into the file.

However, overwriting the old file requires high IO overheads. In addition, overwriting greatly delays a writing operation that requires a file switch.

SUMMARY

One aspect of the present disclosure provides a data unit reuse method, wherein data is stored in a data unit in the form of a data block and the data block has a block ID. The method includes: successively reading each data block in a current data unit to search for a first specific data block whose block ID does not conform to a predetermined order; determining whether at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit; when at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determining that the current data unit has been damaged, and when no data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determining that the data block immediately previous to the specific data block is a data end.

Another aspect of the present disclosure provides a data unit reuse method, which includes: acquiring a reusable data unit; renaming the reusable data unit with a new data unit name according to a predetermined order of data unit names; and writing a new data block into the reusable data unit according to a predetermined order of data block IDs.

Another aspect of the present disclosure provides a data unit reuse apparatus, wherein data is stored in a data unit in the form of a data block and the data block has a block ID. The apparatus includes: a memory; and a processor, coupled to the memory and configured to: successively read each data block in a current data unit to search for a first specific data block whose block ID does not conform to a predetermined order; determine whether at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit; when at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determine that the current data unit has been damaged, and when no data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determine that the data block immediately previous to the specific data block is a data end.

Another aspect of the present disclosure provides a data unit reuse apparatus. The apparatus includes: a memory; and a processor, coupled to the memory and configured to: acquire a reusable data unit; rename the reusable data unit with a new data unit name according to a predetermined order of data unit names; and write a new data block into the reusable data unit according to a predetermined order of data block IDs.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of a use status of a recyclable data unit according to an embodiment.

FIG. 2 is a schematic diagram of a data unit according to an embodiment.

FIG. 3 is a flow chart of a method of writing into a recyclable data unit according to an embodiment.

FIG. 4 is a flow chart of a method of writing into a reused old data unit according to an embodiment.

FIG. 5 is a flow chart of a method of determining a writable position in a current data unit according to an embodiment.

FIG. 6 is a schematic diagram of a data unit reuse system according to an embodiment.

FIG. 7 is a schematic diagram of reading-writing cases of a reusable file according to an embodiment.

DETAILED DESCRIPTION

Embodiments of the specification are described in detail below with reference to the accompanying drawings. It should be noted that the described embodiments are examples rather than all of the embodiments consistent with the specification. Based on the described embodiments of the specification, other modified embodiments may be acquired by persons of ordinary skill in the art without creative effort and also belong to the protection scope of the specification.

FIG. 1 is a schematic diagram of a use status of a recyclable data unit according to an embodiment. For illustrative purposes, a file is used herein as an example of the recyclable data unit. However, persons of ordinary skill in the art should understand that any other recyclable data units of different sizes, forms, and formats are all applicable to the present disclosure. The recyclable data unit may be of a fixed or variable length. Referring to FIG. 1, in a status (a), a new data unit 102 is created and then data is written thereinto. In a status (b), data continues to be written until the data unit 102 is full, and then a next data unit 102 is created and data is written thereinto. Similarly, after a certain quantity of data units 102 are created and data is written thereinto, the capacity reaches an upper limit, as shown in a status (c). In this case, an old data unit needs to be reused, as shown in a status (d). In another case, the old data unit may be reused even when the capacity does not reach the upper limit (this case is not shown in the figure), so as to save storage resources.

In an embodiment, reuse of the data unit may start from, for example, the earliest data unit. However, because the old data unit has dirty data, a mechanism for differentiating new data from the dirty data may be used.

FIG. 2 is a schematic diagram of a data unit 200 according to an embodiment. The data unit 200 may be a file, such as a log file, a database file, a data file, or the like. However, the present disclosure is not limited thereto, and other data units may also be applicable to the present disclosure.

The data unit 200 may store one or more data blocks 202 of a fixed or variable length. Each data block 202 may include a block head 204 and a block body 206, and the block head 204 may include metadata. In an embodiment, the block head 204 may at least include, for example, a block ID 208 and a block checksum 210. The block head 204 may also include other information. For example, when the data block 202 is of a variable length, the block head 204 may also include the number of bytes to be currently written (namely, the length of the block).

In an embodiment, as an identifier of the data block 202, the block ID 208 can uniquely identify the data block and conforms to a predetermined order. For example, the block ID 208 may be set to monotonically increasing or decreasing numbers. In either an increasing or a decreasing order, adjacent block IDs may be continuous or discontinuous. For example, the block IDs may be correlated timestamps, continuously increasing/decreasing integers, or the like. The block checksum 210 may include a checksum of the current data block 202. For example, the block checksum 210 may include a complete checksum of the block head 204 and the block body 206. The block body 206 may include data, such as logs, data items, data, or the like. The block body 206 may include an aggregation of multiple data items, such as an aggregation of multiple logs, multiple pieces of data, or the like.
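As a non-limiting illustration, the data block 202 of FIG. 2 may be sketched in Python as follows. The header layout (an 8-byte block ID, a 4-byte body length, and a 4-byte CRC32), the use of CRC32 as the block checksum, and the names `HEADER`, `encode_block`, and `decode_block` are assumptions made only for this sketch and are not specified by the present disclosure.

```python
import struct
import zlib

# Assumed header layout: block ID (u64), body length (u32), CRC32 (u32).
HEADER = struct.Struct("<QII")


def encode_block(block_id: int, body: bytes) -> bytes:
    """Serialize a block; the checksum covers the ID, the length, and the body."""
    prefix = struct.pack("<QI", block_id, len(body))
    checksum = zlib.crc32(prefix + body)
    return HEADER.pack(block_id, len(body), checksum) + body


def decode_block(buf: bytes, offset: int = 0):
    """Return (block_id, body, next_offset), or None if the block is invalid."""
    if offset + HEADER.size > len(buf):
        return None  # not enough bytes left for a block head
    block_id, length, checksum = HEADER.unpack_from(buf, offset)
    start = offset + HEADER.size
    body = buf[start:start + length]
    if len(body) != length:
        return None  # declared length runs past the end of the buffer
    if zlib.crc32(struct.pack("<QI", block_id, length) + body) != checksum:
        return None  # checksum mismatch: dirty or damaged data
    return block_id, body, start + length
```

In this sketch, a block left over from an earlier use of the data unit still decodes as valid; it is the block ID order, rather than the checksum alone, that distinguishes it from new data.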

FIG. 3 is a flow chart of a method 300 of writing into a recyclable data unit according to an embodiment. For example, this data unit may be the data unit 200 described in FIG. 2. The method 300 may include box 302: determining that there is a data block to be written. The data block may be referred to as, for example, a to-be-written data block. For example, for a log file, box 302 may include: determining that a log is to be written into the log file. For another example, for a database file, box 302 may include: determining that a data item is to be written into the database file. For still another example, for a data file, box 302 may include: determining that data is to be written into the data file. In the embodiment, the data block may be the data block 202 described in FIG. 2.

In box 304, it is determined whether the current data unit has enough remaining space for the to-be-written data block. For example, if writing the to-be-written data block into the current data unit after the last written data block does not exceed the size limit of the current data unit, it can be determined that the current data unit has enough space. Otherwise, it can be determined that the current data unit does not have enough space. When it is determined in box 304 that the current data unit has enough space (that is, a judgment result of box 304 is “yes”), the method 300 goes to box 306. Otherwise, when it is determined in box 304 that the current data unit does not have enough space (that is, a judgment result of box 304 is “no”), the method 300 goes to box 308.

In box 308 of the method 300, it is determined whether to create a new data unit or reuse an old data unit. This box may be implemented based on various factors. In an embodiment, it may be determined, based on different scenarios, user preferences, system settings, and the like, whether to create a new data unit or reuse an old data unit. For example, if as many data units as possible are expected to be retained for as long as possible for future use such as archiving, a new data unit is preferably created. On the other hand, if only the data units that must be retained are expected to be kept so as to reduce the use of storage space, for example, if a disk is non-exclusive, old data units may be preferably reused. In an embodiment, when the total amount of data units has reached an upper limit, it is determined that an old data unit needs to be reused. Box 308 may be implemented based on a combination of various considerations.

When it is determined that a new data unit needs to be created in box 308, the method 300 goes to box 310: creating a new data unit as the current data unit. Then, the method 300 goes to box 306. On the other hand, when it is determined that an old data unit needs to be reused in box 308, the method 300 goes to box 312. In box 312 of the method 300, an old data unit is reused as the current data unit, which will be further described below. Then the method 300 goes to box 306, in which the to-be-written data block is written into the current data unit. The method 300 ends.
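As a non-limiting illustration, the flow of the method 300 may be sketched with a small in-memory model. The class names `DataUnit` and `UnitWriter`, the use of increasing integers as data unit names, and the policy of reusing the oldest data unit only when an upper limit on the number of data units is reached are assumptions of this sketch; box 308 may apply other criteria. The sketch also assumes that no single block exceeds the capacity of a data unit.

```python
from dataclasses import dataclass


@dataclass
class DataUnit:
    name: int       # data unit name; increasing numbers are assumed
    capacity: int   # size limit of the data unit in bytes
    used: int = 0   # offset just past the last written block


class UnitWriter:
    """Hypothetical in-memory model of the flow of FIG. 3."""

    def __init__(self, capacity: int, max_units: int):
        self.capacity = capacity
        self.max_units = max_units
        self.units = [DataUnit(name=1, capacity=capacity)]

    def write(self, block: bytes) -> DataUnit:
        current = self.units[-1]
        if current.used + len(block) > current.capacity:  # box 304: "no"
            if len(self.units) >= self.max_units:         # box 308
                current = self._reuse_oldest()            # box 312
            else:
                current = self._create()                  # box 310
        current.used += len(block)                        # box 306
        return current

    def _create(self) -> DataUnit:
        unit = DataUnit(name=self.units[-1].name + 1, capacity=self.capacity)
        self.units.append(unit)
        return unit

    def _reuse_oldest(self) -> DataUnit:
        unit = self.units.pop(0)
        # Rename so that the reused unit becomes the latest one. On a real
        # disk the old bytes would remain in place as dirty data; only the
        # write position is reset here.
        unit.name = self.units[-1].name + 1 if self.units else unit.name + 1
        unit.used = 0
        self.units.append(unit)
        return unit
```

With a capacity of 10 bytes and at most two data units, three 6-byte writes land in data units 1, 2, and 3, where data unit 3 is the renamed data unit 1.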

FIG. 4 is a flow chart of a method 400 of writing data into a reused old data unit according to an embodiment. The method 400 includes box 402: acquiring a reusable data unit. In an embodiment, the acquiring the reusable data unit may include acquiring the earliest (namely, the oldest) reusable data unit. For example, when the data unit is a file in a disk, the acquiring the earliest reusable data unit may include acquiring the earliest file from the disk. The acquiring the earliest file may be implemented by using various solutions. In an embodiment, file names may be generated in order when files are initially generated. For example, the file names (namely, file IDs) may be generated by using increasing numbers. The file names are not limited to the increasing numbers. For example, the file names may be generated in alphabetical or lexicographic order, or by using a combination of numbers and letters, provided that the file names are in order. Thus, the acquiring the earliest reusable file may be implemented by acquiring a file name ranking at the top in a corresponding sequence. For example, when the file names are generated by using increasing numbers, the earliest reusable file can be acquired by acquiring the smallest number among all the current file names.

Then, in box 404 of the method 400, the acquired earliest reusable data unit is renamed with a new data unit name (for example, a file name) according to a certain order. The renamed data unit can be used as a new data unit allowing writing in. Each time a data unit is reused, the data unit is renamed to guarantee that the data unit name conforms to a predetermined order. In an embodiment, the data unit names (for example, file names) may be formed by increasing or decreasing numbers. The numbers used as the data unit names may be continuous or discontinuous, but the present disclosure is not limited thereto. For example, the data unit names may be a combination of letters and numbers, provided that they conform to a predetermined order.
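As a non-limiting illustration, boxes 402 and 404 may be sketched for files in a directory as follows. The function name `reuse_oldest_unit` and the convention of zero-padded, purely numeric, increasing file names are assumptions of this sketch; other ordered naming schemes are equally possible.

```python
import os


def reuse_oldest_unit(directory: str) -> str:
    """Rename the oldest numerically named file so it becomes the newest.

    Returns the new file name. The file content is left untouched, so it
    still holds the old (dirty) data after the rename.
    """
    names = sorted((n for n in os.listdir(directory) if n.isdigit()), key=int)
    if not names:
        raise FileNotFoundError("no reusable data unit found")
    oldest, newest = names[0], names[-1]               # box 402
    new_name = str(int(newest) + 1).zfill(len(newest))
    os.rename(os.path.join(directory, oldest),         # box 404
              os.path.join(directory, new_name))
    return new_name
```

For a directory holding files `0001`, `0002`, and `0003`, the call renames `0001` to `0004`. No data is rewritten by the rename, which is why the reader later relies on the block IDs to tell new data from dirty data.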

Then, in box 406, the method 400 includes writing to-be-written data into the new data unit (namely, the renamed data unit). In an embodiment, the data may be written into the data unit 200 described in FIG. 2 in the form of the data block 202 described in FIG. 2. For example, for a log file or a data file, multiple logs (or multiple pieces of data) may be aggregated into a data block, and the block may include a head portion. Referring to the foregoing description of the block head 204 in FIG. 2, the head portion of the block may include metadata. For example, the head portion of the block may include a block ID and check data, and reference is made to the block head 204 described in FIG. 2 that at least includes the block ID 208 and the block checksum 210 (and optionally includes other metadata). Each block may be of a fixed or variable length. For example, when the block (for example, a data block) is of a variable length, the head portion may also include the number of bytes to be currently written (namely, the length of the block). As an identifier of the block, the block ID can uniquely identify the block and conforms to a predetermined order. For example, the block IDs may be monotonically increasing or decreasing numbers, and reference may be made to the foregoing description of the block ID 208 in FIG. 2. In either an increasing or a decreasing order, adjacent block IDs may be continuous or discontinuous. For example, the block IDs may be correlated timestamps, continuously increasing/decreasing integers, or the like. The check data may include a checksum of the current block, and reference may be made to the foregoing description of the block checksum 210 in FIG. 2. After the to-be-written data is written into the renamed data unit, the method 400 ends.

FIG. 5 is a flow chart of a method 500 of determining a writable position in a current data unit according to an embodiment. The determining the writable position in the current data unit generally includes determining an end position of the latest write. The method 500 starts from box 502. In box 504 of the method 500, a block ID of the last block written into the previous data unit is acquired as an initial block ID. In one embodiment, a value of this block ID may be acquired by reading the previous data unit. In another embodiment, each time a file is switched, the value of the block ID may be recorded in an external storage apparatus or another storage position, and acquired from the external storage apparatus or another storage position upon restart.

In box 506 of the method 500, the current data unit is read and a next block in the current data unit is successively read. The current data unit may be the currently latest data unit. In an embodiment, the currently latest data unit may be determined according to data unit names. For example, when the data unit names are increasing numbers, the currently latest data unit is a data unit having a data unit name of the currently largest number. The current data unit name may also be recorded in an external storage apparatus or another storage position each time the file is switched, and acquired from the external storage apparatus or another storage position upon restart. In an embodiment, the current data unit may be read at a fixed length in an offset increasing direction. For example, for a block of a fixed length L, the reading the current data unit at a fixed length may include reading data of the length L in the current data unit each time successively. However, the present disclosure is not limited thereto. For example, the present disclosure is not limited to the reading at a fixed length. For a variable-length block, if the maximum length of the block is L, the reading the current data unit at a fixed length may include reading data of the length L in the current data unit each time successively, and then determining the length of a current block by parsing a head portion of the current block. Generally speaking, it is only required to guarantee that content read each time covers a head portion of a block to guarantee that the head portion of the block can be parsed.

In box 508, it is determined whether the read block is a valid block. In an embodiment, the determining whether the read block is a valid block includes checking a checksum in a head portion of the block. However, the present disclosure is not limited thereto, and other manners of determining whether the block is valid may also be applicable.

When the read block is determined as a valid block in box 508, the method 500 goes to box 510. In box 510 of the method 500, it is determined whether a block ID of the read block conforms to a predetermined order. For example, the determining whether a block ID of the read block conforms to a predetermined order may include determining whether the block ID of the read block increases (or decreases, which depends on system settings) as compared with that of the previous block. For example, an initial value of the block ID of the previous block may be set to the initial block ID acquired in box 504, namely, the block ID of the last block written into the previous data unit. If it is determined that the block ID of the read block increases (or decreases) as compared with that of the previous block, the method 500 returns to box 506 to read the next block. In an embodiment, the reading the next block may include successively parsing the next block in the current data unit in an offset increasing order. Specifically, parsing the next block includes parsing a head portion of the next block (and if required, including parsing the content contained in the block).

When the read block is determined as an invalid block in box 508, the method 500 goes to box 512. The determining the read block as an invalid block may include determining that a checksum in the head portion of the block fails to pass the check. In an embodiment, boxes 508 and 510 may be combined, and the method 500 goes to box 512 if an invalid block and/or a block whose block ID does not conform to the predetermined order is found.

In box 512 of the method 500, the current data unit is parsed to determine whether a valid data block whose block ID conforms to the predetermined order exists in the following content of the current data unit. In an embodiment, the determining whether the data block is valid may include checking the data block based on the checksum. In an embodiment, the determining whether the block ID of the data block conforms to the predetermined order may include determining whether the block ID of the data block increases (or decreases) as compared with a block ID of the previous block. For example, the method 500 may determine whether a block ID of each valid data block as parsed is smaller than a block ID of the previously read block from a file offset of the current block to the end (for example, the EOD) of the current data unit.

If a judgment result of box 512 is yes, it indicates that the current data unit has been damaged. For example, it indicates that the current data block in the current data unit probably has been damaged. Thus, the method 500 goes to box 514, in which the process exits the program or other corresponding actions are taken. If the judgment result of box 512 is no, the method 500 goes to box 516, in which it is determined that a data end (for example, the EOD) has been reached in reading of the current data unit.

In box 510 of the method 500, if it is determined that the block ID of the read block does not increase (or does not decrease, which depends on the system setting) as compared with that of the previous block, the method 500 goes to box 516, in which it is determined that the data end (for example, the EOD) has been reached in reading the current data unit. Information about the data end may be recorded in an external storage apparatus or another storage position, and acquired from the external storage apparatus or another storage position upon restart, so as to read/write and reuse these data units more efficiently.
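As a non-limiting illustration, boxes 506 to 516 of the method 500 may be sketched over a list of already parsed blocks. Representing each block as a `(block_id, is_valid)` tuple, using increasing block IDs, and the names `scan_data_unit`, `"end"`, and `"damaged"` are assumptions of this sketch; the checksum verification of box 508 is abstracted into the `is_valid` flag.

```python
def scan_data_unit(blocks, initial_id):
    """Return (status, index) for a sequence of (block_id, is_valid) tuples.

    status is "end" when the data end is found and "damaged" when the data
    unit appears damaged; index is the position of the first block that is
    not part of the new data.
    """
    prev_id = initial_id                        # box 504
    for i, (block_id, is_valid) in enumerate(blocks):   # box 506
        if is_valid and block_id > prev_id:     # boxes 508 and 510: "yes"
            prev_id = block_id
            continue
        if is_valid:
            # Box 510: valid block whose ID does not increase -> box 516.
            return "end", i
        # Box 512: invalid block; look for a later valid block whose ID
        # still increases relative to the last good block.
        for later_id, later_valid in blocks[i + 1:]:
            if later_valid and later_id > prev_id:
                return "damaged", i             # box 514
        return "end", i                         # box 516
    return "end", len(blocks)                   # unit is full of new data
```

For example, with an initial block ID of 999, the sequence `[(1000, True), (1001, True), (5, True)]` yields `("end", 2)`, whereas `[(1000, True), (1001, True), (0, False), (1003, True)]` yields `("damaged", 2)`.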

In an embodiment, if a fixed-length block is used, it is even not required to check the fixed-length block. That is, box 508 may be omitted, and boxes 510 and 512 may be combined. For example, it may be directly determined whether the block ID of the current block increases (or decreases), and the next block is read if the block ID of the current block increases (or decreases). If the block ID of the current block does not increase (or does not decrease), it is further determined whether a block whose block ID increases exists in the following content of the current data unit. If it exists, it is determined that the file has been damaged and corresponding processing is performed. If it does not exist, it is determined that the data end has been reached. Of course, the fixed-length block may also be checked by using the process described in FIG. 5 to achieve a better effect.
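As a non-limiting illustration, the simplified fixed-length variant, in which box 508 is omitted and boxes 510 and 512 are combined, may be sketched over a raw byte buffer. The block size of 16 bytes, the placement of an 8-byte little-endian block ID at the start of each block, and the names `BLOCK_SIZE`, `BLOCK_ID`, and `scan_fixed_length` are assumptions of this sketch.

```python
import struct

BLOCK_SIZE = 16                  # assumed fixed block length L in bytes
BLOCK_ID = struct.Struct("<Q")   # assumed: block ID is the first 8 bytes


def scan_fixed_length(buf: bytes, initial_id: int):
    """Return (status, offset) for a buffer of fixed-length blocks.

    No checksum is verified. offset is the byte position at which new data
    ends, that is, the writable position when status is "end".
    """
    ids = [BLOCK_ID.unpack_from(buf, off)[0]
           for off in range(0, len(buf) - BLOCK_SIZE + 1, BLOCK_SIZE)]
    prev_id = initial_id
    for i, block_id in enumerate(ids):
        if block_id > prev_id:   # ID increases: keep reading
            prev_id = block_id
            continue
        # ID does not increase: check the remaining content of the unit
        # for any block whose ID increases relative to the last good one.
        damaged = any(later > prev_id for later in ids[i + 1:])
        return ("damaged" if damaged else "end"), i * BLOCK_SIZE
    return "end", len(ids) * BLOCK_SIZE
```

Because every block has the same length, the block heads of old data always remain aligned with those of new data, which is why this variant can compare block IDs directly without first validating each block.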

In the embodiments of the present disclosure, by reuse of a data unit, metadata in a file system does not need to be synchronized in writing, thus improving writing efficiency. In addition, based on a combination of increasing (or decreasing) block IDs and a checksum, the embodiments can eliminate the need to write an additional EOD field during each writing operation, can eliminate the need to perform overwriting, and can avoid data loss and the reading of dirty data. The embodiments of the present disclosure can further accurately detect a damaged data unit.

FIG. 6 is a schematic diagram of a data unit reuse system 600 according to an embodiment. The system 600 may include a processor 602, a memory 604, a hard disk 606, a removable disk 608, and other storage 610. The system 600 may further include a reusable data unit acquisition component 612, a reusable data unit renaming component 614, a reusable data unit writing component 616, a reusable data unit reading component 618, a valid data block determining component 620, a data block ID determining component 622, and the like. These components may be coupled together with a bus 630 and mutually communicate.

In an embodiment, the reusable data unit acquisition component 612 can perform the operation described in box 402 in FIG. 4. The reusable data unit renaming component 614 can perform the operation described in box 404 in FIG. 4. The reusable data unit writing component 616 can perform the operation described in box 406 in FIG. 4. The reusable data unit reading component 618 can perform the operation described in box 506 in FIG. 5. The valid data block determining component 620 can perform the operation described in box 508 in FIG. 5. The data block ID determining component 622 can perform the operations described in boxes 504, 510 and 512 in FIG. 5. The processor 602 can be configured to perform the operations described in boxes 514 and 516 in FIG. 5.

In an embodiment, the reusable data unit acquisition component 612, the reusable data unit renaming component 614, the reusable data unit writing component 616, the reusable data unit reading component 618, the valid data block determining component 620, and the data block ID determining component 622 may be implemented by using different hardware, software, or firmware. For example, in hardware implementation, these components may be implemented by using a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a circuit, and the like. In software implementation, these components may be stored in the memory 604 of the system 600, and are executed by the processor 602. These components may also be implemented by using a combination of the above-described means, which all fall within the scope of the present disclosure.

FIG. 7 is a schematic diagram of reading-writing cases of a reusable file according to an embodiment. FIG. 7(a) shows a case in which data blocks are of a fixed length and there is no damaged data block. FIG. 7(b) shows a case in which data blocks are of a fixed length and there is a damaged data block. FIG. 7(c) shows a case in which data blocks are of variable lengths and there is no damaged data block. FIG. 7(d) shows a case in which data blocks are of variable lengths and there is a damaged data block.

For the case (a) in which data blocks are of a fixed length and there is no damaged data block, when data blocks in the file are successively read until a data block 1001, these read data blocks are all valid and have increasing block IDs. Afterwards, a block 0005 is read. Although the block 0005 is valid, its block ID does not increase as compared with the block ID of the previous data block 1001, and thus does not conform to a predetermined order. In this file, if no valid block whose block ID increases as compared with the block 1001 exists after the block 0005, it indicates that the data block 1001 previous to the block 0005 is a file end.

For the case (b) in which data blocks are of a fixed length and there is a damaged data block, when data blocks in the file are successively read until a data block 1001, these read data blocks are all valid and have increasing block IDs. Afterwards, an invalid block or a valid data block whose block ID does not increase is read. In this file, if at least one valid block 1003 whose block ID increases as compared with the block 1001 exists after this invalid block, it indicates that the file has been damaged.

For the case (c) in which data blocks are of variable lengths and there is no damaged data block, when data blocks in the file are successively read until a data block 1001, these read data blocks are all valid and have increasing block IDs. Afterwards, a block 0006 is read (a block head of a block 0005 has been at least partially overwritten by the block 1001). Although the block 0006 is valid, its block ID does not increase as compared with the block ID of the previous data block 1001, and thus does not conform to the predetermined order. In this file, if no valid block whose block ID increases as compared with the block 1001 exists after the block 0006, it indicates that the data block 1001 previous to the block 0006 is a file end.

For the case (d) in which data blocks are of variable lengths and there is a damaged data block, when data blocks in the file are successively read until a data block 1001, these read data blocks are all valid and have increasing block IDs. Afterwards, an invalid block or a valid data block whose block ID does not increase is read. In this file, if at least one valid block 1003 whose block ID increases as compared with the block 1001 exists after this invalid block/this data block whose block ID does not increase, it indicates that the file has been damaged.

It can be understood that, although different embodiments are described by using a log, a database, or a data file as an example of a data unit herein, the present disclosure is not limited to a file reading-writing scenario and various reusable data units are also applicable to the present application. Likewise, although different embodiments are described by using a disk as a position for storing the data units herein, the present disclosure is not limited to a disk reading-writing scenario and various storage manners are also applicable to the present disclosure.

In the present disclosure, the term “or” has an inclusive rather than exclusive meaning. That is, unless otherwise indicated or clearly seen from the context, using “A” or “B” as a phrase “X” is intended to cover any natural collocation. That is, the using “A” or “B” as the phrase “X” can be realized by any of the following instances: using A as X, using B as X, or using a combination of A and B as X. The terms “connection” and “coupling” may have the same meaning, which indicates that two devices are electrically connected. In addition, the articles “a,” “an” and “the” used in the present disclosure and the appended claims should generally be understood as “one or more,” unless it is otherwise stated or clearly seen from the context that they indicate singular forms.

Various aspects or features are presented in the form of a system that can include several apparatuses, components, modules, and other similar objects. It should be understood and appreciated that various systems may include additional apparatuses, components, modules, etc., and/or may not include all the apparatuses, components, modules, etc. discussed with reference to the accompanying drawings. A combination of all the means may also be used.

Various illustrative logics, logic blocks, modules, and circuits described with reference to the embodiments disclosed herein can be implemented or executed by using a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or another programmable logic device, a discrete gate or a transistor logic, a discrete hardware component, or any combination designed to implement functions of the present disclosure. The general-purpose processor may be a microprocessor. In some embodiments, the processor may be any conventional processor, controller, microcontroller, or state machine. The processor may also be implemented as a combination of computing devices, such as a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with the DSP core, or any other similar configuration. In addition, at least one processor may include one or more modules used to perform one or more steps and/or actions described above. For example, the foregoing embodiments described by using different methods can be implemented by a processor and a memory coupled to the processor. The processor can be configured to perform any step of any method described above or any combination of the steps.

Moreover, steps and/or actions of the methods or algorithms described with reference to the aspects disclosed herein can be directly implemented in hardware, a software module executed by the processor, or a combination of the two. For example, the foregoing embodiments described by using different methods can be implemented by using a computer readable medium storing computer program codes. The computer program codes, when executed by the processor/computer, perform any step of any method described above or any combination of the steps.

Although the specification has been described in conjunction with specific embodiments, many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, the following claims embrace all such alternatives, modifications and variations that fall within the terms of the claims.

1. A data unit reuse method, wherein data is stored in a data unit in a form of a data block and the data block has a block ID, and the method comprises: successively reading each data block in a current data unit to search for a first specific data block whose block ID does not conform to a predetermined order; determining whether at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit; when at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determining that the current data unit has been damaged, and when no data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determining that a data block immediately previous to the specific data block is a data end.
2. The method of claim 1, wherein: the predetermined order comprises an increasing order or a decreasing order; and block IDs of continuous data blocks comprise continuous block IDs or discontinuous block IDs.
3. The method of claim 1, wherein: reuse of a data unit comprises reuse of multiple data units, and the current data unit comprises a latest data unit in the multiple data units.
4. The method of claim 3, wherein successively reading each data block in a current data unit to search for a first specific data block whose block ID does not conform to a predetermined order further comprises: acquiring a block ID of a last block written into a data unit immediately previous to the current data unit as an initial ID; and based on the initial ID, successively reading each data block in the current data unit to search for the first specific data block whose block ID does not conform to the predetermined order.
5. The method of claim 1, wherein the data block further comprises check data, and successively reading each data block in a current data unit further comprises: checking each data block based on the check data so as to determine whether the data block is a valid data block, and when the data block is an invalid data block, determining the data block as the first specific data block whose block ID does not conform to the predetermined order.
6. The method of claim 1, wherein the data unit is at least one of a file, a log, or a database.
7. The method of claim 1, wherein the data block is a fixed-length data block or a variable-length data block.
8. A data unit reuse method, comprising: acquiring a reusable data unit; renaming the reusable data unit with a new data unit name according to a predetermined order of data unit names; and writing a new data block into the reusable data unit according to a predetermined order of data block IDs.
9. The method of claim 8, wherein acquiring a reusable data unit comprises acquiring an oldest data unit.
10. The method of claim 8, wherein: the predetermined order of data unit names comprises an increasing order or a decreasing order; and data unit names of continuous data units comprise continuous data unit names or discontinuous data unit names.
11. The method of claim 8, wherein: the predetermined order of data block IDs comprises an increasing order or a decreasing order; and block IDs of continuous data blocks comprise continuous block IDs or discontinuous block IDs.
12. The method of claim 8, wherein the data block at least comprises a data block ID, check data, and a block body.
13. The method of claim 12, wherein the data unit is at least one of a file, a log, or a database; and the block body comprises an aggregation of multiple data items.
14. The method of claim 8, wherein the data block is a fixed-length data block or a variable-length data block.
15. A data unit reuse apparatus, wherein data is stored in a data unit in a form of a data block and the data block has a block ID, and the apparatus comprises: a memory; and a processor, coupled to the memory and configured to: successively read each data block in a current data unit to search for a first specific data block whose block ID does not conform to a predetermined order; determine whether at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit; when at least one data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determine that the current data unit has been damaged, and when no data block whose block ID conforms to the predetermined order exists after the specific data block in the current data unit, determine that a data block immediately previous to the specific data block is a data end.
16. The apparatus of claim 15, wherein: the predetermined order comprises an increasing order or a decreasing order; and block IDs of continuous data blocks comprise continuous block IDs or discontinuous block IDs.
17. The apparatus of claim 15, wherein: reuse of a data unit comprises reuse of multiple data units, and the current data unit comprises a latest data unit in the multiple data units.
18. The apparatus of claim 17, wherein the processor being configured to successively read each data block in a current data unit to search for a first specific data block whose block ID does not conform to a predetermined order comprises the processor being configured to: acquire a block ID of a last block written into a data unit immediately previous to the current data unit as an initial ID; and based on the initial ID, successively read each data block in the current data unit to search for the first specific data block whose block ID does not conform to the predetermined order.
19. The apparatus of claim 15, wherein the data block also comprises check data, and the processor being configured to successively read each data block in a current data unit further comprises the processor being configured to: check each data block based on the check data so as to determine whether the data block is a valid data block, and when the data block is an invalid data block, determine the data block as the first specific data block whose block ID does not conform to the predetermined order.
20. The apparatus of claim 15, wherein the data unit is at least one of a file, a log, or a database.
21. The apparatus of claim 15, wherein the data block is a fixed-length data block or a variable-length data block.
22. A data unit reuse apparatus, comprising: a memory; and a processor, coupled to the memory and configured to: acquire a reusable data unit; rename the reusable data unit with a new data unit name according to a predetermined order of data unit names; and write a new data block into the reusable data unit according to a predetermined order of data block IDs.
23. The apparatus of claim 22, wherein the processor being configured to acquire a reusable data unit comprises the processor being configured to acquire an oldest data unit.
24. The apparatus of claim 22, wherein: the predetermined order of data unit names comprises an increasing order or a decreasing order; and data unit names of continuous data units comprise continuous data unit names or discontinuous data unit names.
25. The apparatus of claim 22, wherein: the predetermined order of data block IDs comprises an increasing order or a decreasing order; and block IDs of continuous data blocks comprise continuous block IDs or discontinuous block IDs.
26. The apparatus of claim 22, wherein the data block at least comprises a data block ID, check data, and a block body.
27. The apparatus of claim 26, wherein the data unit is at least one of a file, a log, or a database; and the block body comprises an aggregation of multiple data items.
28. The apparatus of claim 22, wherein the data block is a fixed-length data block or a variable-length data block.